Foundations of Machine Learning Frameworks
CSCN8010 - Winter 2024
Professor: Ran Feldesh
Student: Arcadio de Paula Fernandez
To create the graph below, we will use Matplotlib, a plotting library for Python. As data, we will use the classic Titanic dataset, which records each passenger's age, sex, ticket class, fare, survival status, and more.
For more information about the Titanic you can access the following link.
The graph is a histogram of passenger ages: each bar shows how many passengers fall within a given age range.
# Importing several libraries for data visualization
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
# Load the Titanic dataset from seaborn's data repository
df = sns.load_dataset('titanic')
df.head()
|  | survived | pclass | sex | age | sibsp | parch | fare | embarked | class | who | adult_male | deck | embark_town | alive | alone |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 3 | male | 22.0 | 1 | 0 | 7.2500 | S | Third | man | True | NaN | Southampton | no | False |
| 1 | 1 | 1 | female | 38.0 | 1 | 0 | 71.2833 | C | First | woman | False | C | Cherbourg | yes | False |
| 2 | 1 | 3 | female | 26.0 | 0 | 0 | 7.9250 | S | Third | woman | False | NaN | Southampton | yes | True |
| 3 | 1 | 1 | female | 35.0 | 1 | 0 | 53.1000 | S | First | woman | False | C | Southampton | yes | False |
| 4 | 0 | 3 | male | 35.0 | 0 | 0 | 8.0500 | S | Third | man | True | NaN | Southampton | no | True |
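The preview shows `NaN` in the `deck` column, which is how pandas marks missing values. As a minimal sketch, assuming a small hand-built frame (`sample`, a hypothetical name) that copies three columns from the five rows above, missing values per column can be counted like this:

```python
import numpy as np
import pandas as pd

# Hypothetical mini-frame copying three columns from the five preview rows
sample = pd.DataFrame({
    "age": [22.0, 38.0, 26.0, 35.0, 35.0],
    "sex": ["male", "female", "female", "female", "male"],
    "deck": [np.nan, "C", np.nan, "C", np.nan],
})

# isna() flags missing cells; sum() counts them per column
missing = sample.isna().sum()
print(missing)
```

On the full dataset, `df.isna().sum()` applies the same check to every column.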
# Select the 'age' column and drop missing values with .dropna()
ages = df['age'].dropna()
# Set the number of bins for the histogram
n_bins = 30
# Creating the histogram plot
plt.hist(ages, bins=n_bins, edgecolor="white")
# Setting labels and title
plt.xlabel('Age')
plt.ylabel('Number of Passengers')
plt.title('Histogram of Passenger Ages')
# Showing the plot
plt.show()
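To make the binning concrete, here is a small sketch of the counting that a histogram performs, using NumPy's `np.histogram`. The five ages are the ones in the preview table, and three bins are chosen only to keep the output short; both are illustrative assumptions rather than the settings used for the plot.

```python
import numpy as np

# Ages from the five preview rows (illustrative sample, not the full column)
sample_ages = np.array([22.0, 38.0, 26.0, 35.0, 35.0])

# Split the range [22, 38] into 3 equal-width bins and count the values in each
counts, edges = np.histogram(sample_ages, bins=3)

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f}: {count} passenger(s)")
```

Each count corresponds to the height of one bar. The plot performs the same kind of counting on the full `ages` column with 30 bins.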
The graph below is another histogram of passenger ages, this time drawn with Seaborn, another Python data visualization library. The bars are split by sex, and a density curve is overlaid on each group.
import seaborn as sns
# Load the Titanic dataset from seaborn's data repository
df = sns.load_dataset('titanic')
df.head()
plt.figure(figsize=(10, 6))
sns.histplot(data=df, x='age', kde=True, hue='sex')
plt.title('Age Distribution by Gender')
plt.show()
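The `hue='sex'` argument splits the histogram into one series per group. As a rough sketch of the counts behind such a split, the example below cross-tabulates sex against age bins with pandas. The five rows come from the preview table, and the bin edges (20, 30, 40) are an assumption chosen for illustration; Seaborn picks its own bins.

```python
import pandas as pd

# Sex and age from the five preview rows (illustrative sample only)
sample = pd.DataFrame({
    "sex": ["male", "female", "female", "female", "male"],
    "age": [22.0, 38.0, 26.0, 35.0, 35.0],
})

# Assign each age to a bin: (20, 30] or (30, 40]
age_bin = pd.cut(sample["age"], bins=[20, 30, 40])

# Rows are the hue groups, columns are the age bins
table = pd.crosstab(sample["sex"], age_bin)
print(table)
```

Each row of the table corresponds to one colored series in the plot, and each cell to the height of one bar.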
The graphs below are built with Plotly Express, another Python data visualization library. The first is a pie chart of the share of passengers who survived and who died; the second is a scatter plot of fare against age, colored by survival.
# Load the Titanic dataset from seaborn's data repository and save it in titanic_data
titanic_data = sns.load_dataset('titanic')
# Viewing the first 5 rows
titanic_data.head()
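import plotly.express as px
import plotly.offline as pyo
# Enable Plotly rendering inside the notebook
pyo.init_notebook_mode()
# Map the 0/1 codes in 'survived' to readable labels so the color map below matches the slice names
titanic_data['SurvivalLabel'] = titanic_data['survived'].map({0: 'Not Survived', 1: 'Survived'})
fig = px.pie(titanic_data, names='SurvivalLabel', color='SurvivalLabel', title='Passenger Survival',
             color_discrete_map={'Not Survived': 'red', 'Survived': 'green'},
             labels={'SurvivalLabel': 'Survival'})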
fig.show()
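# Scatter plot of fare vs. age; marker size scales with fare, color encodes survival (0 = died, 1 = survived)
fig = px.scatter(titanic_data, x='fare', y='age', color='survived', size='fare')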
fig.show()
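The slices of the pie chart are the shares of each value in `survived`. A minimal sketch of that calculation with pandas, using only the five preview rows (so the fractions here are not those of the full dataset):

```python
import pandas as pd

# 'survived' values from the five preview rows (0 = died, 1 = survived)
survived = pd.Series([0, 1, 1, 1, 0], name="survived")

# normalize=True returns fractions instead of raw counts
shares = survived.value_counts(normalize=True)
print(shares)
```

Running the same call on `titanic_data['survived']` should give the proportions that the pie chart displays.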
!jupyter nbconvert --to html "C:\Users\arcad\OneDrive\Área de Trabalho\02_Machine Learning Engineer\09_Conestoga_Applied AI_ML\01_Term 1_Conestoga_Subjects\02_Foundations ML_Ran\00_ML_Lab_Sandbox\CSCN8010-labs-Class 2\Class 2_Lab_Arcadio_v3.ipynb" --output-dir ./docs/